Yahoo!Clusty - Adding real-time clustering functionality to the Yahoo! web search engine

Author

  • Giuseppe Narzisi
Abstract

Yahoo!Clusty is a clustering meta-search engine (MSE) that lets users send queries to Yahoo! and groups the returned snippets into homogeneous clusters by topic. The objective of this project has been to create a flexible MSE for the Yahoo! web search engine that presents the results returned for a query in a more structured format, so that the user can easily explore them by category. The basic idea, which has recently become a focus of attention in the information retrieval community [6, 7], is to treat the snippet of each returned web page as a consistent representation of that page and to group the snippets into homogeneous clusters by means of clustering and categorization algorithms. The processing must be done on the fly at run time, so it requires efficient design and implementation of the underlying technologies and algorithms in order to minimize the latency between issuing the query and presenting the results. Many different approaches have been presented in the last 10 years (Copernic, Dogpile, iBoogie, Kartoo, Mooter, Vivisimo, etc.), and many academic prototypes have been explored as well. A recent example is the Armil [1] meta-search engine. Given the limited amount of time and the complexity of the project, the goal is not to develop a sophisticated MSE that outperforms all previous MSEs, but to create a flexible platform for testing various clustering algorithms and labeling techniques on snippets, and to show that all this can be achieved in a one-semester project. Moreover, the system has been developed in such a way that it can easily be extended with more functionality.
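The abstract does not say which clustering or labeling algorithms Yahoo!Clusty uses, so the following is only a minimal sketch of the general idea of grouping snippets by topic and labeling each group. It assumes TF-IDF weighting, cosine similarity, and a single-pass "leader" clustering step chosen here for illustration; the function names, the stopword list, and the 0.3 threshold are hypothetical, and the snippets are passed in as plain strings rather than fetched from Yahoo!.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"a", "an", "and", "the", "of", "for", "in", "on", "to", "is", "with"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def tfidf_vectors(snippets):
    """Return one sparse TF-IDF vector (dict: term -> weight) per snippet."""
    docs = [Counter(tokenize(s)) for s in snippets]
    df = Counter(term for doc in docs for term in doc)  # document frequency
    n = len(docs)
    # Smoothed IDF so that terms occurring in every snippet keep a weight of 1.
    return [
        {t: tf * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, tf in doc.items()}
        for doc in docs
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def leader_cluster(vectors, threshold=0.3):
    """Single-pass clustering: each snippet joins the most similar existing
    cluster if the similarity reaches the threshold, else starts a new one."""
    clusters, centroids = [], []  # lists of index lists / summed vectors
    for i, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for c, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append([i])
            centroids.append(dict(vec))
        else:
            clusters[best].append(i)
            for term, w in vec.items():
                centroids[best][term] = centroids[best].get(term, 0.0) + w
    return clusters


def label_cluster(cluster, vectors, k=2):
    """Label a cluster with its k heaviest terms (ties broken alphabetically)."""
    total = {}
    for i in cluster:
        for term, w in vectors[i].items():
            total[term] = total.get(term, 0.0) + w
    return [t for t, _ in sorted(total.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]


if __name__ == "__main__":
    snippets = [
        "Jaguar car dealer offers new Jaguar car models",
        "Jaguar car reviews and car prices",
        "The jaguar is a big cat living in the rainforest",
        "Big cat conservation: jaguar habitat in the rainforest",
    ]
    vectors = tfidf_vectors(snippets)
    for cluster in leader_cluster(vectors):
        print(label_cluster(cluster, vectors), [snippets[i] for i in cluster])
```

A single pass over the snippets keeps the per-query cost low, which fits the run-time latency constraint mentioned above, but the result depends on the order of the snippets; a real system would likely compare several algorithms, as the project intends.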

Similar resources

Evaluation of the Tehran University of Medical Sciences website based on webometric criteria in 2008

Background and Aim: Nowadays university websites are very important in information services. Universities have therefore designed websites to categorize large amounts of information and make it available. This study aimed to evaluate the Tehran University of Medical Sciences website based on webometrics criteria in 2008. Materials and Methods: This survey used the link analysis metho...


Making Web Results Relevant with SAS ®

Many companies search the Web to learn about their competition and understand their potential customers. But how accurate are these search results? For instance, have you ever submitted the query "SAS", only to get results back about "Scandinavian Airline Systems"? This paper presents a SAS-based solution to accessing and clustering Yahoo! search engine results by using SAS Text Miner. We demon...


"openness of search engine": A critical flaw in search systems; a case study on google, yahoo and bing

There is no doubt that Search Engines are playing a great role in Internet usage. But all the top search engines Google, Yahoo and Bing are having a critical flaw called “Openness of a Search Engine”. An Internet user should be allowed to get the search results only when requested through Search engine’s web page but the user must not be allowed to get the search results when requested through ...


Yahoo! Learning to Rank Challenge Overview

Learning to rank for information retrieval has gained a lot of interest in the recent years but there is a lack for large real-world datasets to benchmark algorithms. That led us to publicly release two datasets used internally at Yahoo! for learning the web search ranking function. To promote these datasets and foster the development of state-of-the-art learning to rank algorithms, we organize...


Yahoo!Search and Web API Utilized Mashup based e-Leaning Content Search Engine for Mobile Learning

Mashup based content search engine for mobile devices is proposed. Mashup technology is defined as search engine with plural different APIs. Mash-up has not only the plural APIs, but also the following specific features, (1) it enables classifications of the contents in concern by using web 2.0, (2) it may use API from the different sites, (3) it allows information retrievals from both sides of...


Publication date: 2008